Use weak references to refer to the original data in the accessor #880
Labels: evalml (EvalML request), needs design (issues requiring design documentation), new feature (suggestions for new functionality)
As a user of woodwork, I noticed that the table and column accessors hold a strong reference to the original dataframe/series. This prevents the garbage collector from freeing the memory used by the original data: the accessor always points to the data, so its reference count never drops below 1. We should use a weak reference instead so that the garbage collector can free the memory. To see how this would work, see #881.
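A minimal sketch of the idea, using only the standard library. `FakeFrame` and `WeakAccessor` are hypothetical stand-ins for a dataframe and its accessor, not woodwork's actual classes, and the real change in #881 may look different:

```python
import gc
import weakref


class WeakAccessor:
    """Hypothetical accessor that keeps only a weak reference to its data."""

    def __init__(self, data):
        # weakref.ref does not increase the reference count of `data`.
        self._data_ref = weakref.ref(data)

    @property
    def data(self):
        data = self._data_ref()  # returns None once the data is gone
        if data is None:
            raise ReferenceError("the original data has been garbage collected")
        return data


class FakeFrame:
    """Stand-in for a dataframe that owns an accessor pointing back at it."""

    def __init__(self, values):
        self.values = values
        self.ww = WeakAccessor(self)  # hypothetical accessor attribute


frame = FakeFrame([1, 2, 3])
accessor = frame.ww
assert accessor.data is frame  # the accessor can still reach the data

del frame     # drop the only strong reference to the data
gc.collect()  # not strictly needed in CPython, but makes the intent explicit
data_is_gone = accessor._data_ref() is None
```

With a strong reference in `WeakAccessor`, the `accessor` variable would keep the frame alive after `del frame`. The trade-off is that the accessor has to handle the case where the data has already been freed.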
To convince myself this was happening, I used the following script, which I took from this blog post.
The `leaky_list` and `non_leaky_list` functions were added to sanity check that only leaky objects appear in `gc.garbage`.
The output should look like this, and we can see that the pandas dataframe with woodwork is now "uncollectable":
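For readers who want to try this kind of sanity check, here is a rough sketch. It is not the script from the blog post. It assumes the check relies on `gc.DEBUG_SAVEALL`, which makes the collector store every unreachable object it finds in `gc.garbage` instead of freeing it. The function bodies are guesses based on their names:

```python
import gc


def leaky_list():
    items = []
    items.append(items)  # the list refers to itself, forming a reference cycle


def non_leaky_list():
    items = [1, 2, 3]  # no cycle: freed by reference counting on return
    return len(items)


gc.collect()                    # clear out any pre-existing cyclic garbage
baseline = len(gc.garbage)
gc.set_debug(gc.DEBUG_SAVEALL)  # keep unreachable objects in gc.garbage

non_leaky_list()
gc.collect()
after_non_leaky = len(gc.garbage) - baseline

leaky_list()
gc.collect()
after_leaky = len(gc.garbage) - baseline

# Pick out the self-referencing list created by leaky_list().
self_referencing = [
    obj for obj in gc.garbage
    if isinstance(obj, list) and len(obj) == 1 and obj[0] is obj
]

gc.set_debug(0)     # restore normal collector behaviour
gc.garbage.clear()  # release the saved objects
```

Only the cyclic list should be picked up by the collector. The non-cyclic one is released by reference counting before the collector ever sees it.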